Modeling for Predicting the Potential Geographical Distribution of Three Ephedra Herbs in China

Ephedra species help protect the environment in desert and grassland ecosystems and have high ecological, medicinal, and economic value. To support the protection and sustainable development of Ephedra, we combined occurrence records of Ephedra sinica Stapf., Ephedra intermedia Schrenk et C.A. Mey., and Ephedra equisetina Bge. with climate, soil, and topographic factors to simulate the suitable habitat of the three species using ensemble models on the Biomod2 platform. The models were evaluated using AUC, TSS, and kappa coefficients. The results demonstrated that the ensemble model accurately predicted the potential distributions of E. sinica, E. intermedia, and E. equisetina. Eastern and central Inner Mongolia, middle and eastern Gansu, and northeastern Xinjiang were the optimum regions for the growth of E. sinica, E. intermedia, and E. equisetina, respectively. Several key environmental factors strongly influenced the suitable habitats of the three species: the key factors affecting the distribution of E. sinica, E. intermedia, and E. equisetina were annual average precipitation, altitude, and vapor pressure, respectively. In conclusion, the suitable ranges of the three Ephedra species were mainly in Northwest China, and topography and climate were the primary influencing factors.


Introduction
Ephedra species are perennial herbs of the genus Ephedra in the family Ephedraceae. Members of this genus are commonly found in dry wastelands, riverbeds, and grasslands. They are important species of sandy grassland and often form simple communities over large areas [1]. These plants are important sand-fixing plants in Northwest China and assist with water and soil conservation, desertification prevention, and environmental improvement because of their high drought tolerance and ability to withstand extreme temperatures and saline-alkali conditions [2]. Fifteen Ephedra species and four varieties can be found in China, and E. sinica, E. intermedia, and E. equisetina are the main species used in medicine [3]. These three Ephedra herbs have been widely used in traditional Chinese medicine (TCM) since ancient times [4], primarily to treat colds, coughs, and asthma [5]. The medicinal effect of Ephedra depends on its active substances. Ephedra herbs contain a variety of alkaloids, volatile substances, flavonoids, sugars, and minerals; alkaloids have the highest content and are the main active components of Ephedra [6]. Modern pharmacological studies have revealed that ephedrine promotes sweating, alleviates colds, promotes lung health, relieves asthma, and reduces swelling. Many effects of Ephedra are associated with ephedrine-type alkaloids. The effects of various secondary metabolites differ. For example, ephedrine is an effective component for ephedra to play analgesic, regulate blood pressure, stimulate the central

The SDM and Its Accuracy
The average AUC, TSS, and kappa values of single models and ensemble models for three Ephedra species were evaluated on the Biomod2 platform (Table 1). Among the 10 single models, RF, GBM, and GLM had high accuracy and excellent performance in predicting the potential distribution areas of E. sinica, E. intermedia, and E. equisetina, while SRE had the worst accuracy. The average AUC, TSS, and kappa values for E. sinica's ensemble model were 0.97, 0.84, and 0.79, respectively; those for E. intermedia were 0.98, 0.87, and 0.81, respectively; and those for E. equisetina were 0.98, 0.91, and 0.87, respectively. Compared with single models, the ensemble models for three Ephedra species had significantly improved AUC, TSS, and kappa values, indicating that the ensemble models could more accurately predict potentially suitable habitats for E. sinica, E. intermedia, and E. equisetina.

Potential Suitable Area Distribution for Three Ephedra Herbs Species
The highly suitable area (0.6-0.8) and most suitable area (>0.8) were selected as important suitable habitats (ISHs) for the three Ephedra herbs. According to the results of the ensemble model in Biomod2 (Figure 2) ...
We also calculated the area occupied by each of the three Ephedra herbs in different administrative regions at each level of suitability (Figure 3). As shown in Figure 3a, Inner Mongolia had the largest total suitable area for E. sinica, with approximately 106.53 × 10⁴ km², followed by Xinjiang and Gansu, with 59.16 × 10⁴ km² and 26.81 × 10⁴ km², respectively. Although the total suitable area in Xinjiang was relatively large, most of these sites had low suitability and were not recommended for the growth of E. sinica. According to the percentage of suitable areas (more than 80%) and the area of the ISH, Inner Mongolia, Gansu, Hebei, Shanxi, Shaanxi, and Ningxia were most suitable for the development of E. sinica.
In Figure 3b, the total suitable area for E. intermedia was the highest in Xinjiang, with approximately 104.16 × 10⁴ km², followed by Inner Mongolia and Gansu, with values of 70.70 × 10⁴ km² and 38.72 × 10⁴ km², respectively. Based on the percentage of suitable regions and the ISH areas, Xinjiang, Gansu, Inner Mongolia, Shaanxi, and Ningxia were most suitable for planting E. intermedia. In Figure 3c, the total suitable area for E. equisetina was the highest in Xinjiang (118.25 × 10⁴ km²), followed by Inner Mongolia (108.03 × 10⁴ km²). The total suitable area was also large in Qinghai, Tibet, and Sichuan, but the proportion of low-suitability area there was high, making these regions unsuitable for the growth and survival of E. equisetina. However, Xinjiang, Inner Mongolia, Gansu, and Ningxia had large ISH areas for E. equisetina, accounting for a relatively high proportion.
Through a comprehensive ISH evaluation, the proportion of suitable areas and actual local development were used to obtain the final suitable area for the development of the three Ephedra species. Inner Mongolia had the widest suitable habitat of E. sinica, accounting for more than 80%. The eastern part of Inner Mongolia was the most suitable area for the conservation and restoration of E. sinica, followed by northern Hebei, northern Shanxi, northern Shaanxi, Ningxia, and central Gansu. The ISH area for E. intermedia was the largest in Gansu, and the middle and eastern parts of Gansu were the most suitable priority sites for E. intermedia, followed by central Inner Mongolia, Ningxia, and northern Xinjiang. The ISH area for E. equisetina was the most extensive in Xinjiang, and the northern part was the most suitable area for the conservation and restoration of this species, followed by the central and western parts of Inner Mongolia and central parts of Gansu.
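The area tallies in this section rest on assigning each grid cell to a suitability grade and summing cell areas. The following is a minimal Python sketch of that bookkeeping, not the ArcGIS workflow used in the study. It uses the five grades of this study (breaks at 0.2, 0.4, 0.6, and 0.8) and treats values of 0.6 or more as ISH. The cell values, the nominal 1 km² cell area, and the choice to place boundary values in the upper grade are illustrative assumptions.

```python
import bisect
from collections import Counter

# Grade breaks from the study; the labels are shorthand for its five grades.
BREAKS = [0.2, 0.4, 0.6, 0.8]
LABELS = ["unsuitable", "low", "medium", "high", "most suitable"]
CELL_AREA_KM2 = 1.0  # assumption: nominal area of one 30 arc-second cell


def suitability_class(value):
    """Return the grade label for a suitability value in [0, 1].

    Assumption: a value exactly on a break (e.g. 0.6) goes to the upper grade.
    """
    return LABELS[bisect.bisect_right(BREAKS, value)]


# Hypothetical suitability values for six grid cells in one region.
cells = [0.05, 0.35, 0.61, 0.79, 0.92, 0.6]
classes = [suitability_class(v) for v in cells]

# Area per grade, and the ISH area (highly suitable plus most suitable).
area_by_class = {label: n * CELL_AREA_KM2 for label, n in Counter(classes).items()}
ish_area_km2 = area_by_class.get("high", 0.0) + area_by_class.get("most suitable", 0.0)
```

In practice the cell area varies with latitude for rasters in geographic coordinates, so a real tally would use an equal-area projection or per-cell areas.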

Discussion
Using the ensemble model on the Biomod2 platform, this study investigated the potential distribution of three Ephedra herbs. Based on occurrence records and environmental factors, ensemble models predicted the habitat suitability maps for E. sinica, E. intermedia, and E. equisetina, with excellent performance measured by the AUC, TSS, and kappa values. As a result, we believe our model results are robust and adequate for constructing the overall suitable habitat distribution of Ephedra herbs in China.

Model Results and Verification
According to Flora of China and previous studies, E. sinica is mainly distributed in Liaoning, Inner Mongolia, Hebei, Shanxi, Shaanxi, Shandong, Gansu, Qinghai, and Xinjiang; E. intermedia is distributed in Liaoning, Inner Mongolia, Hebei, Shandong, Shanxi, Shaanxi, Gansu, Ningxia, Qinghai, and Xinjiang; and E. equisetina is found in Inner Mongolia, Hebei, Shanxi, Shaanxi, Gansu, Ningxia, and Xinjiang [38,39]. These ranges are very similar to the ISH range detected in our research, which supports the accuracy of the ensemble models on the Biomod2 platform in predicting species distribution. Furthermore, some researchers used the MaxEnt model to predict the suitable areas for the three Ephedra herbs and found that E. sinica was distributed mainly in central and eastern Inner Mongolia, as well as in eastern Gansu; E. intermedia was primarily found in central Gansu Province, with sporadic occurrences in eastern Qinghai and Xinjiang; and E. equisetina was found in western Inner Mongolia, central Gansu, and northern Xinjiang [40,41]. These predicted ranges are much smaller than those reported in Flora of China and in previous records. This comparison suggests that the Biomod2 platform is better suited for predicting the suitable habitat of the three species, whereas the MaxEnt model appears overly conservative.
Different models have different algorithms and simulation processes, and the prediction results will thus be different [42]. The AUC, TSS, and kappa metrics were chosen as the test values to evaluate the model in our study, and the values were used to effectively evaluate and compare model performance. In single models, RF, GBM, and GLM performed well in predicting the potential distribution of E. sinica, E. intermedia, and E. equisetina, while SRE performed the worst. The accuracy of the ensemble models after screening and integration was higher than that of all single models, which can better predict the range of suitable areas for Ephedra resources. This result was similar to that in previous studies on species such as Salvia miltiorrhiza Bunge, Paeonia lactiflora Pall, and Plateau pika (Ochotona curzoniae) [43][44][45]. However, due to the diversity of species growth characteristics and calculation parameters, the matching degree between niche models and species cannot rely only on simple test values. In the future, we should also consider other indicators comprehensively to test the model [46].

Effects of Environmental Variables on Three Ephedra Herbs
Climate, topography, and soil play key roles in plant survival, especially for species in arid areas with harsh habitats. Ephedra herbs are xerophytic plants with perennial roots persisting in the soil for many years. Appropriate temperature and water are the basic conditions for the physiological activities of Ephedra herbs. Furthermore, topographic factors such as slope, aspect, and elevation, affect plant growth by influencing regional temperature, hydrology, and soil. Previous studies showed that precipitation in the warmest season was a key environmental factor affecting the distribution of E. foliata, and the average annual temperature was the main driving force affecting the distribution of E. gerardiana [47]. Soil was the main factor influencing the distribution of E. strobilacea [48]. In addition, for the co-occurring species of Ephedra, Rosa arabica Crep., average annual precipitation, elevation, and average annual temperature were important drivers determining its distribution [49]. Similar to previous findings [41], the results of our study revealed that average annual precipitation was the most important environmental factor influencing the distribution of E. sinica, elevation was the most important driver influencing the distribution of E. intermedia, and E. equisetina was most affected by vapor pressure. In summary, topographic and climatic factors are important factors affecting the distribution of Ephedra.
Our study found large areas of low-suitability habitat for the three Ephedra species in several provincial-level regions, such as Yunnan and the Tibet Autonomous Region, which indicates that Ephedra herbs adapt strongly to the environment. However, because environmental factors affect plants in complex ways, it was difficult to obtain all of the environmental variables that affect the distribution of these three Ephedra herbs and to define their suitable areas precisely. Furthermore, when applying the results to actual planting and restoration, we must consider land occupation as well as the impact of the surrounding environment, such as water quality, vegetation coverage, and human activities. In this study, for example, the suitability for E. sinica in Beijing and Tianjin was relatively high, but the two cities have a large share of construction land, which is not suitable for the planting and development of E. sinica.

Conservation Strategies for Ephedra
Our research identified suitable habitats for three Ephedra herbs by constructing ensemble models to predict the potential distribution of E. sinica, E. intermedia, and E. equisetina. In the suitable habitat areas of the three species, the protection of Ephedra resources should be strengthened. Relevant policies and measures to prohibit the destruction of wild Ephedra herbs should be issued, and wild resources can be protected in situ by designating protected areas [50]. In addition, encouraging the artificial cultivation of Ephedra herbs and introducing wild Ephedra resources into cultivation are important ways to protect endangered resources. E. sinica has been cultivated for many years, but the other two species have rarely been introduced into cultivation. Governments in regions with important suitable habitat should encourage local farmers to cultivate E. sinica, E. intermedia, and E. equisetina. However, farmers' enthusiasm for planting is affected by market price fluctuations, and one-off planting readily occurs, resulting in an unstable supply of Ephedra medicinal materials [51]. Given these circumstances, governments can use policy and economic means to sustain farmers' enthusiasm for Ephedra cultivation, promote balanced planting of the three Ephedra herbs, and promote the protection of wild Ephedra resources. Proper harvesting methods also benefit the long-term utilization of Ephedra resources. Ephedra herbs should be harvested in the middle of the branches or near the head, as harvesting close to the root can kill the entire plant [52]. Therefore, farmers must be trained in harvesting to ensure that Ephedra can regrow the following year and that Ephedra resources are not depleted.

Occurrence Collection
Distribution data for E. sinica, E. intermedia, and E. equisetina in China were collected from the Global Biodiversity Information Facility (GBIF, http://www.gbif.org/ accessed on 1 January 2022), the Chinese Virtual Herbarium (CVH, http://www.cvh.org.cn/ accessed on 1 January 2022), and previous studies [53][54][55]. Google Maps was used to obtain coordinates for samples that lacked geographic coordinate data, and we removed duplicate records as well as records for which coordinates could not be obtained. Using the ArcGIS fishnet tool, we filtered the data in 10 km × 10 km cells so that each Ephedra species had only one sample point per cell, which helps avoid overfitting the model. Finally, a total of 109, 113, and 85 occurrence records of E. sinica, E. intermedia, and E. equisetina, respectively, were obtained for constructing ensemble models (Figure 4). The occurrence data for the three Ephedra herbs were output as longitude and latitude coordinates and saved as CSV files for subsequent analysis. In addition, Biomod2 requires absence points for each species when analyzing potential habitat, and such data are typically difficult to obtain. Therefore, we used the model's default method 'random' to generate pseudo-absence points for the three species, and this sampling was repeated three times [43].
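The grid-based filtering step can be illustrated with a short Python sketch. It is not the ArcGIS fishnet procedure itself: it assumes the records have already been projected to a coordinate system in metres, the coordinates are made up, and the function name `thin_to_grid` is hypothetical. It keeps the first record encountered in each 10 km × 10 km cell, which also drops exact duplicates.

```python
import math

CELL_SIZE_M = 10_000  # 10 km cells, as in the text


def thin_to_grid(points, cell_size=CELL_SIZE_M):
    """Keep the first record that falls in each square grid cell.

    Assumption: `points` holds (x, y) tuples in a projected CRS with metre
    units; raw longitude/latitude would need to be projected first.
    """
    kept = {}
    for x, y in points:
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        kept.setdefault(cell, (x, y))  # later records in the same cell are skipped
    return list(kept.values())


# Hypothetical projected coordinates (metres) for four raw records.
records = [
    (500_100.0, 4_400_200.0),
    (500_100.0, 4_400_200.0),  # exact duplicate of the first record
    (503_900.0, 4_407_000.0),  # same 10 km cell as the first record
    (512_000.0, 4_400_200.0),  # neighbouring cell to the east
]
thinned = thin_to_grid(records)
```

Which record survives in a crowded cell depends on input order here; a real workflow might instead pick the record nearest the cell centre or choose one at random.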

Environmental Parameters
Ephedra herbs grow in sandy soil with good air permeability and are mainly found in dry wastelands, riverbeds, and grasslands in arid and semiarid areas. Based on these distribution characteristics, we used three groups of environmental variables: climate factors, soil factors, and topographic factors. Climate factors included solar radiation (srad), vapor pressure (vapr), and 19 bioclimatic variables (bio1~bio19), all of which are standard annual average data from 1970 to 2000 and can be downloaded from the WorldClim database (http://worldclim.org/ accessed on 1 December 2021). Soil factors included soil types (soil), a soil particle-size distribution dataset (clay1, clay2, sand1, sand2) [56], and soil quality data (sq1~sq7). The Resource and Environmental Science and Data Center (https://www.resdc.cn/ accessed on 11 June 2022) provided the soil type data. The National Qinghai Tibet Plateau Scientific Data Center (http://data.tpdc.ac.cn/zh-hans/ accessed on 10 March 2022) provided the soil particle-size distribution dataset. The World Soil Database (https://www.fao.org/soils-portal/ accessed on 12 March 2022) provided the soil quality data. Elevation (ele), slope (slop), and aspect (asp) were the topographic factors. Elevation data were derived from the ENVIREM dataset (Environmental Rasters for Ecological Modeling, https://envirem.github.io/ accessed on 12 March 2022), and we extracted the slope and aspect variables from the elevation data using ArcGIS [57].
Plants 2023, 12, 787
These environmental parameters were preprocessed to a common spatial resolution of 30″ latitude/longitude (ca. 1 km² at ground level). We used ArcGIS to extract the environmental variables within the study area and output them in ASCII format to construct the model. Since many environmental factors in the same group are calculated from the same set of basic data, the variables are inevitably multicollinear, which inflates the model test values and produces overly optimistic results [58]. We therefore used the "corrplot" package in R to analyze the correlation coefficient (r) between each pair of variables; when the |r| between two variables was greater than 0.75, we retained only the variable that was easier to interpret and had the higher contribution rate [30]. After removing highly correlated environmental variables, 20 variables were chosen for E. sinica, 17 for E. intermedia, and 19 for E. equisetina (Table 2).
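The correlation screening can be sketched in a few lines of Python. This is an illustration rather than the study's R/"corrplot" workflow: the sample values are invented, the function names are hypothetical, and the priority order stands in for the study's judgement about which variable is easier to interpret and contributes more. Variables are visited in priority order, and one is kept only if its |r| with every variable already kept does not exceed 0.75.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def drop_correlated(samples, priority, threshold=0.75):
    """Greedy filter: keep a variable only if |r| <= threshold with all kept ones.

    Assumption: `priority` lists variable names from most to least preferred.
    """
    kept = []
    for name in priority:
        if all(abs(pearson_r(samples[name], samples[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Invented values of three variables sampled at five occurrence points.
samples = {
    "bio12": [100, 200, 300, 400, 500],
    "bio13": [12, 19, 33, 41, 48],        # almost proportional to bio12
    "ele": [900, 300, 1500, 200, 1100],
}
selected = drop_correlated(samples, priority=["bio12", "ele", "bio13"])
```

With these numbers, bio13 is dropped because its correlation with bio12 is about 0.99, while ele is retained because its correlation with bio12 is below 0.1.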

Model Implementation and Evaluation
We used the models available in Biomod2 to build ensemble models. The applicability of the different models to the three species was evaluated from the accuracy of the model results, and the optimal ensemble model was then constructed for each species.
The occurrence data and corresponding environmental data for E. sinica, E. intermedia, and E. equisetina were input into the Biomod2 platform in turn, and the distribution data were divided into two parts: 75% of the data were randomly selected for modeling, and the remaining 25% were used to test the model results. Three sets of pseudo-absence points were randomly generated, and each model was run 10 times. Therefore, 300 single-model results (10 models × 3 pseudo-absence sets × 10 runs) were generated for each Ephedra species.
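The run design described above can be laid out as a small Python sketch. It does not call Biomod2; it only enumerates the combinations and shows one random 75%/25% split. The algorithm names `algo_1` to `algo_10`, the pseudo-absence set labels, the seed, and the rounding rule for the training size are placeholders and assumptions, not values from the study.

```python
import itertools
import random


def split_records(records, train_fraction=0.75, seed=0):
    """Randomly split records into training and testing subsets.

    Assumption: the training size is the rounded fraction of the total.
    """
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder names for the 10 single models and 3 pseudo-absence sets.
algorithms = [f"algo_{i}" for i in range(1, 11)]
pa_sets = ["PA1", "PA2", "PA3"]
repetitions = range(1, 11)

# One entry per single-model run: 10 x 3 x 10 = 300 for each species.
runs = list(itertools.product(algorithms, pa_sets, repetitions))

# Stand-in identifiers for the 109 E. sinica occurrence records.
occurrences = list(range(109))
train, test = split_records(occurrences)
```

Each of the 10 repetitions would normally draw a fresh split, for example by passing a different seed per repetition.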
To evaluate the accuracy and quality of the predictions, we used the area under the curve (AUC), the true skill statistic (TSS), and Cohen's kappa coefficient (kappa). Relying on a single evaluation index makes the assessment overly dependent on that index; because different indexes respond differently to the diagnostic threshold and to species prevalence, combining multiple test values avoids this problem and gives a better evaluation of model performance [59][60][61]. AUC is the area under the receiver operating characteristic (ROC) curve; it is not affected by the diagnostic threshold or by species prevalence, and it ranges from 0 to 1. The AUC was used to verify and evaluate the accuracy and robustness of the model. An AUC above 0.9 indicates excellent performance, whereas a value of 0.5 or below is no better than random. TSS has the ability to distinguish between "TRUE" and "FALSE" results and avoids a unimodal response to species prevalence. However, it is sensitive to the threshold. It ranges from −1 to 1, with values of 0 or below indicating performance no better than random. Equation (1) was used to calculate TSS:

$$\mathrm{TSS} = \frac{ad - bc}{(a + c)(b + d)} \tag{1}$$

where a refers to the number of true positives, b refers to the number of false positives, c refers to the number of false negatives, and d refers to the number of true negatives. Kappa measures prediction accuracy relative to the accuracy expected by chance. It is affected by both prevalence and threshold, and it also ranges from −1 to 1. It is calculated as follows:

$$\mathrm{Kappa} = \frac{P_o - P_e}{1 - P_e}, \qquad P_o = P \cdot Sn + (1 - P) \cdot Sp, \qquad P_e = P\,[P \cdot Sn + (1 - P)(1 - Sp)] + (1 - P)\,[P\,(1 - Sn) + (1 - P) \cdot Sp] \tag{2}$$

where P, Sn, and Sp are the prevalence, sensitivity, and specificity, respectively, P_o is the observed accuracy, and P_e is the accuracy expected to occur by chance [62]. The closer the values of AUC, TSS, and kappa are to 1, the better the result and the more accurate the prediction of species distribution. Values further from 1 indicate a result that is closer to a random estimate.
In our study, AUC, TSS, and kappa were all greater than 0.9, meeting the criteria for screening the results of single models to construct the three ensemble models. Furthermore, we used a jackknife test to determine the relative importance of the explanatory variables. The potential distribution areas of the three Ephedra species were obtained after running the ensemble model, and the results were projected and output in ASCII format for further analysis in ArcGIS. We reclassified the potential distribution area using ArcGIS, converted the grid values to a range of 0-1, and divided the model-predicted distribution area into five grades: unsuitable (0~0.2), low suitability (0.2~0.4), medium suitability (0.4~0.6), high suitability (0.6~0.8), and most suitable (0.8~1) [63,64].
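As a worked illustration of the two threshold-dependent metrics, the Python sketch below computes TSS and kappa from the four confusion-matrix counts a, b, c, and d defined above, using the standard definitions of both statistics. The counts are invented for the example and do not come from the study.

```python
def tss(a, b, c, d):
    """True skill statistic: sensitivity + specificity - 1."""
    sensitivity = a / (a + c)  # true positives over all observed presences
    specificity = d / (b + d)  # true negatives over all observed absences
    return sensitivity + specificity - 1


def kappa(a, b, c, d):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = a + b + c + d
    p_o = (a + d) / n
    p_e = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (p_o - p_e) / (1 - p_e)


# Invented counts: 40 true positives, 10 false positives,
# 5 false negatives, 45 true negatives.
a, b, c, d = 40, 10, 5, 45
tss_value = tss(a, b, c, d)      # 40/45 + 45/55 - 1, about 0.707
kappa_value = kappa(a, b, c, d)  # (0.85 - 0.50) / (1 - 0.50) = 0.70
```

The TSS value equals (ad − bc)/((a + c)(b + d)) = 1750/2475, which shows that the sensitivity-plus-specificity form and the count form agree.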

Conclusions
The ensemble model on the Biomod2 platform was used to simulate and predict the potential suitable habitats of E. sinica, E. intermedia, and E. equisetina, and the effects of environmental factors on the three Ephedra herbs were discussed. In the model evaluation, RF, GBM, and GLM demonstrated the best performance among single models. However, the ensemble model improved the prediction accuracy and simulated the potential suitable habitats of the species more accurately than the single models. We analyzed key environmental variables affecting the distribution of the three Ephedra herbs and found that average annual precipitation was the key environmental factor affecting the distribution of E. sinica, elevation was the most important environmental factor affecting the distribution of E. intermedia, and vapor pressure was the primary factor affecting the distribution of E. equisetina. The important factors affecting all three Ephedra herbs were annual mean precipitation and elevation. The three Ephedra herbs were mostly found in Northwest China. The central and eastern parts of Inner Mongolia were the most suitable development areas for E. sinica, the central and eastern parts of Gansu were the most suitable habitats for E. intermedia, and northern Xinjiang was the most suitable for the growth of E. equisetina. The findings of our study provide guidance for the cultivation and protection of Ephedra herbs and support the long-term development of grassland and desert ecosystems.